In the paper Image Style Transfer Using Convolutional Neural Networks, Gatys et al. discovered that image style transfer can be performed by using a pre-trained convolutional neural network (vgg19) to extract the content and style representations of a content image and a style image, and then using those representations to create a new image. (See the paper, or Udacity's lab here, for more details.)
I performed an experiment applying style transfer in another notebook, StyleTransfer_Experiment_Results.ipynb. In this notebook, I will focus on gaining a better understanding of the Content Representation and Style Representations of the images used in that experiment.

In short, based on the findings described in the paper, we can extract the Content Representation and Style Representations of an image by performing a forward pass of the image through the pre-trained vgg19 network and reading the representations from specific layers:
- the Content Representation is the output of the conv4_2 layer, and
- the Style Representations are the Gram matrices computed from the outputs of the conv1_1, conv2_1, conv3_1, conv4_1, and conv5_1 layers.

To recap the experiment in the other notebook, StyleTransfer_Experiment_Results.ipynb: I performed the style transfer process using a content image, a photo of a god statue in the Grand Palace, Bangkok, Thailand, and a style image, a Thai-artistic image, to create the new image shown below:
%matplotlib inline
%load_ext autoreload
%autoreload 2
%config InlineBackend.figure_format = 'retina'
import torch
import matplotlib.pyplot as plt
import seaborn as sns
from PIL import Image
from src.StyleTransfer import StyleTransfer
from src.internal import im_convert
content_image = 'images/GodStatute_GrandPalace_Bangkok.jpg'
style_image = 'images/mah_tip.jpg'
st = StyleTransfer(content_image=content_image, style_image=style_image)
st.showInputImages()
print('Target Image:')
target_im = Image.open('output/god_statue_small_lr.png')
target_im
To visualize the Content Representation of the content image, we will inspect the output of the conv4_2 layer when forward passing the content image through the vgg19 network as shown in the picture below:

Note that this operation has already been performed when the StyleTransfer object was constructed; the tensor data of the convolutional layers of interest can be retrieved via the object's data property.
# Inspect available data stored in the object
st.data.keys()
# Then, retrieve the output of the 'conv4_2' layer
feat_content_all = st.data['features_content']
feat_content = feat_content_all['conv4_2']
feat_content.size()
# Remove the batch dimension since we don't need it
feat_content = feat_content.squeeze()
feat_content.size()
There are a total of 512 filters in this conv4_2 layer. As you can see in the images below, each filter detects a different object or shape arrangement in the content image, and we will use the outputs of those filters as our Content Representation in the style transfer process.
# Inspect the output of each filter via
st.showContentRepresentation()
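showContentRepresentation is this repo's own helper; a rough sketch of such a visualization with plain matplotlib (assuming a squeezed (C, H, W) feature tensor like feat_content above) might look like:

```python
import torch
import matplotlib.pyplot as plt

def show_feature_maps(features, n=8):
    """Plot the first n channels of a (C, H, W) feature tensor as grayscale images."""
    fig, axes = plt.subplots(1, n, figsize=(2 * n, 2))
    for i, ax in enumerate(axes):
        ax.imshow(features[i].detach().cpu().numpy(), cmap='gray')
        ax.set_title(f'filter {i}', fontsize=8)
        ax.axis('off')
    return fig
```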
To visualize the Style Representations of the style image, we will compute the Gram matrix of the output of each of the conv1_1, conv2_1, conv3_1, conv4_1, and conv5_1 layers when forward-passing the style image through the vgg19 network, as shown in the picture below:

Below is an illustration of how to compute a Gram matrix for a particular convolutional layer:

Note that this operation has also already been performed when the StyleTransfer object was constructed; the Gram matrices can be retrieved via the object's data property.
style_grams = st.data['style_grams']
style_grams
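For reference, the Gram-matrix computation illustrated above is just a few lines of PyTorch. A minimal sketch, assuming a feature tensor shaped (1, C, H, W) as produced by the network:

```python
import torch

def gram_matrix(tensor):
    """Gram matrix of a (1, C, H, W) feature tensor: pairwise dot products
    between the C flattened feature maps."""
    _, c, h, w = tensor.size()
    flat = tensor.view(c, h * w)   # one row per feature map
    return flat @ flat.t()         # (C, C) matrix of filter correlations
```

Each entry (i, j) measures how strongly filters i and j fire together, which is exactly what the heatmaps below visualize.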
Below are heatmaps of the computed gram matrices in each convolutional layer.
Note that the values on the diagonal tend to be higher (brighter color), which makes sense because each diagonal entry is the dot product of a feature map with itself!
However, notice that there are many cells that are not on the diagonal but still have a bright color. Those indicate pairs of feature maps that are very similar. (Note that the deeper layers have more filters, so you might need to open the plot outside the notebook to see those cells.)
# Create a heatmap of a computed gram matrix in each convolutional layer
st.showStyleRepresentations()
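showStyleRepresentations is the repo's own helper; such a heatmap can be drawn with seaborn roughly as follows (a sketch, assuming a square Gram-matrix tensor):

```python
import torch
import seaborn as sns
import matplotlib.pyplot as plt

def plot_gram_heatmap(gram, layer_name):
    """Render a Gram matrix as a heatmap; bright cells mark strongly
    correlated filter pairs."""
    fig, ax = plt.subplots(figsize=(6, 5))
    sns.heatmap(gram.detach().cpu().numpy(), cmap='viridis', ax=ax)
    ax.set_title(f'Gram matrix of {layer_name}')
    return ax
```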
Generally, the earlier layers are used to detect the LARGER style artifacts, as you can see in the output images of each filter in the conv1_1 layer below:
st.showStyleFiltersAtLayer('conv1_1')
Let's inspect the computed Gram matrix for this layer. As mentioned earlier, the top values are mostly those on the diagonal, but some off-diagonal cells (e.g. (46, 61), (61, 60), (24, 54)) have an even larger value than many of the diagonal entries.
st.showStyleRepresentations(layer_names=['conv1_1'])
# Get the top 20 cells
out = st.getGramMatrixIndicesSortedDescendingly('conv1_1')
out[:20]
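getGramMatrixIndicesSortedDescendingly is the repo's own helper; one plausible sketch of the idea (an assumption about its behavior, reporting each off-diagonal pair once with i < j) is:

```python
import torch

def top_gram_pairs(gram, k=20):
    """Indices (i, j) of the k largest off-diagonal Gram entries, with i < j."""
    n = gram.size(0)
    # Upper-triangle mask excludes the diagonal and duplicate (j, i) pairs
    mask = torch.triu(torch.ones(n, n, dtype=torch.bool), diagonal=1)
    vals = gram[mask]
    pairs = torch.nonzero(mask)
    order = vals.argsort(descending=True)[:k]
    return [tuple(pairs[o].tolist()) for o in order]
```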
Below are the output images of the filter pairs with the top values (excluding the diagonal) in the Gram matrix.
You can easily see that those output images have very similar colors and textures!
st.showTopMatchedStyleFilters('conv1_1')
Let's repeat the same process for the conv2_1, conv3_1, conv4_1, and conv5_1 layers.
As you can see in the images below, as we go deeper in the network, the convolutional layers detect and emphasize SMALLER features!
## Inspect the "conv2_1" layer
st.showStyleFiltersAtLayer('conv2_1')
st.showTopMatchedStyleFilters('conv2_1')
## Inspect the "conv3_1" layer
st.showStyleFiltersAtLayer('conv3_1')
st.showTopMatchedStyleFilters('conv3_1')
## Inspect the "conv4_1" layer
st.showStyleFiltersAtLayer('conv4_1')
st.showTopMatchedStyleFilters('conv4_1')
## Inspect the "conv5_1" layer
st.showStyleFiltersAtLayer('conv5_1')
st.showTopMatchedStyleFilters('conv5_1')